Scalable Collection of Large MPI Traces on Red Storm
Abstract
Gathering large MPI traces and statistics is important for performance analysis and troubleshooting of applications. Traces, which record detailed information about every message an application sends, are crucial for characterizing the application's message-passing behavior. On massively parallel systems like Red Storm, the amount of data collected affects the performance and behavior of the application, which makes naive collection infeasible. We present a new tool that enables the scalable collection of large amounts of data on Red Storm class systems.
Similar resources
ScalaTrace: Tracing, Analysis and Modeling of HPC Codes at Scale
Characterizing the communication behavior of large-scale applications is a difficult and costly task due to code/system complexity and their long execution times. An alternative to running actual codes is to gather their communication traces and then replay them, which facilitates application tuning and future procurements. While past approaches lacked lossless scalable trace collection, we con...
Implementation of Open MPI on Red Storm
The Open MPI project provides a high quality MPI implementation available on a wide variety of platforms. This technical report describes the porting of Open MPI to the Cray Red Storm platform (as well as the Cray XT3 platform). While alpha quality, the port already provides acceptable performance. Remaining porting work and future enhancements are also discussed.
A Comparison of Three MPI Implementations for Red Storm
Cray Red Storm is a new distributed memory massively parallel computing platform designed to scale to tens of thousands of nodes. Red Storm has a custom network designed around the Cray SeaStar network interface and router. In this paper, we present an evaluation of three different MPI implementations for Red Storm: the vendor-supported MPICH2 implementation, and two other implementations based...
ScalaTrace: Scalable compression and replay of communication traces for high-performance computing
We contribute an approach that provides orders of magnitude smaller, if not near-constant size, communication traces regardless of the number of nodes while preserving structural information. We introduce intra- and inter-node compression techniques of MPI events that are capable of extracting an application's communication structure. We further present a replay mechanism for the traces generated...
Application Performance on the Tri-Lab Linux Capacity Cluster - TLCC
In a recent acquisition by DOE/NNSA several large capacity computing clusters called TLCC have been installed at the DOE labs: SNL, LANL and LLNL. TLCC architecture with ccNUMA, multi-socket, multi-core nodes, and InfiniBand interconnect, is representative of the trend in HPC architectures. This chapter examines application performance on TLCC contrasting them with Red Storm/Cray XT4. TLCC and ...
Publication date: 2007